The Integrated Delivery of Large-Scale Data Mining: The ACSys Data Mining Project
نویسندگان
چکیده
Data Mining draws on many technologies to deliver novel and actionable discoveries from very large collections of data. The Australian Government's Cooperative Research Centre for Advanced Computational Systems (ACSys) is a link between industry and research focusing on the deployment of high performance computers for data mining. We present an overview of the work of the ACSys Data Mining projects where the use of large-scale, high performance computers plays a key role. We highlight the use of large-scale computing within three complimentary areas: the development of parallel algorithms for data analysis, the deployment of virtual environments for data mining, and issues in data management for data mining. We also introduce the Data Miner's Arcade which provides simple abstractions to integrate these components providing high performance data access for a variety of data mining tools communicating through XML.
منابع مشابه
Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining
Industrial-Scale R&D IT Projects depend on many sub-technologies which need to be understood and have their risks analysed before the project can begin for their success. When planning such an industrial-scale project, the list of technologies and the associations of these technologies with each other is often complex and form a network. Discovery of this network of technologies is time consumi...
متن کاملA Geometric View of Similarity Measures in Data Mining
The main objective of data mining is to acquire information from a set of data for prospect applications using a measure. The concerning issue is that one often has to deal with large scale data. Several dimensionality reduction techniques like various feature extraction methods have been developed to resolve the issue. However, the geometric view of the applied measure, as an additional consid...
متن کاملApproximate resistivity and susceptibility mapping from airborne electromagnetic and magnetic data, a case study for a geologically plausible porphyry copper unit in Iran
This paper describes the application of approximate methods to invert airborne magnetic data as well as helicopter-borne frequency domain electromagnetic data in order to retrieve a joint model of magnetic susceptibility and electrical resistivity. The study area located in Semnan province of Iran consists of an arc-shaped porphyry andesite covered by sedimentary units which may have potential ...
متن کاملAn Integrated DEA and Data Mining Approach for Performance Assessment
This paper presents a data envelopment analysis (DEA) model combined with Bootstrapping to assess performance of one of the Data mining Algorithms. We applied a two-step process for performance productivity analysis of insurance branches within a case study. First, using a DEA model, the study analyzes the productivity of eighteen decision-making units (DMUs). Using a Malmquist index, DEA deter...
متن کاملA Novel Method for Selecting the Supplier Based on Association Rule Mining
One of important problems in supply chains management is supplier selection. In a company, there are massive data from various departments so that extracting knowledge from the company’s data is too complicated. Many researchers have solved this problem by some methods like fuzzy set theory, goal programming, multi objective programming, the liner programming, mixed integer programming, analyti...
متن کامل